Fake news now propagates easily through online social networks and poses a serious threat to individuals and society. Assessing the authenticity of news is challenging because its content is elaborately fabricated, which makes large-scale annotations for fake news data difficult to obtain. Because of this data scarcity, supervised fake news detectors tend to overfit and fail. Recently, graph neural networks (GNNs) have been adopted to leverage the richer relational information among both labeled and unlabeled instances. Despite their promising results, they inherently focus on pairwise relations between news pieces, which can limit their expressive power for capturing fake news that spreads at the group level. For example, detecting fake news can be more effective when we better understand the relations between news pieces shared among susceptible users. To address these issues, we propose to leverage a hypergraph to represent group-wise interactions among news, while focusing on important news relations with a dual-level attention mechanism. Experiments on two benchmark datasets show that our approach yields remarkable performance and maintains it even with a small subset of labeled news data.
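To make the group-wise idea concrete, the following is a minimal sketch of how news pieces shared by the same user could be grouped into hyperedges and how information could flow node → hyperedge → node. It is not the authors' model: the function names, the scalar features, and the uniform (mean) weighting are assumptions for illustration, and the dual-level attention mentioned above is deliberately left out.

```python
from collections import defaultdict


def build_hyperedges(shares):
    """Group news ids by sharing user; a user with >= 2 news items forms one hyperedge.

    `shares` is a list of (user, news_id) pairs (a hypothetical input format).
    """
    by_user = defaultdict(set)
    for user, news in shares:
        by_user[user].add(news)
    return {u: sorted(n) for u, n in by_user.items() if len(n) >= 2}


def propagate(features, hyperedges):
    """One round of mean aggregation with uniform weights (no attention)."""
    # Hyperedge feature: mean of the features of its member news pieces.
    edge_feat = {
        e: sum(features[n] for n in members) / len(members)
        for e, members in hyperedges.items()
    }
    # Collect, for every news piece, the features of the hyperedges it belongs to.
    incident = defaultdict(list)
    for e, members in hyperedges.items():
        for n in members:
            incident[n].append(edge_feat[e])
    # News feature: mean over incident hyperedges; isolated news keep their own value.
    return {
        n: (sum(incident[n]) / len(incident[n]) if n in incident else x)
        for n, x in features.items()
    }


shares = [("u1", "a"), ("u1", "b"), ("u1", "c"), ("u2", "c"), ("u2", "d"), ("u3", "e")]
features = {"a": 1.0, "b": 0.0, "c": 1.0, "d": 0.0, "e": 0.5}
hyperedges = build_hyperedges(shares)
updated = propagate(features, hyperedges)
```

In this toy input, news piece `c` belongs to two hyperedges, so its updated value mixes both groups, which a purely pairwise graph would only express through several separate edges.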
Multivariate time series forecasting is an important functionality in cyber-physical systems, and its prediction accuracy can be improved significantly by capturing temporal and multivariate correlations among multiple time series. State-of-the-art deep learning methods fail to construct models for full time series because model complexity grows exponentially with time series length. Instead, these methods model local temporal and multivariate correlations within subsequences but fail to capture correlations among subsequences, which significantly affects their forecasting accuracy. To capture the temporal and multivariate correlations among subsequences, we design a pattern discovery model that constructs correlations via diverse pattern functions. Whereas traditional pattern discovery methods use shared and fixed pattern functions that ignore the diversity across time series, we propose a novel pattern discovery method that can automatically capture diverse and complex time series patterns. We also propose a learnable correlation matrix that enables the model to capture distinct correlations among multiple time series. Extensive experiments show that our model achieves state-of-the-art prediction accuracy.
Deep neural networks (DNNs) are susceptible to tiny perturbations introduced by adversarial attacks, which cause erroneous predictions. Various methods, including adversarial defense and uncertainty inference (UI), have been developed in recent years to counter adversarial attacks. In this paper, we propose a multi-head uncertainty inference (MH-UI) framework for detecting adversarial attack examples. We adopt a multi-head architecture with multiple prediction heads (i.e., classifiers) to obtain predictions from different depths in the DNN and to introduce shallow information into the UI. Using independent heads at different depths, the normalized predictions are assumed to follow the same Dirichlet distribution, and we estimate its parameters by moment matching. The cognitive uncertainty introduced by adversarial attacks is reflected and amplified in this distribution. Experimental results show that the proposed MH-UI framework outperforms all the referenced UI methods in the adversarial attack detection task under different settings.
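The moment-matching step can be illustrated with a short sketch. The abstract does not state which moments the authors match, so the estimator below is an assumption: it uses the standard Dirichlet identities E[p_i] = α_i/α_0 and Var[p_i] = m_i(1 − m_i)/(α_0 + 1), summed over classes, to recover the concentration α_0 from the head predictions. The function name and the toy predictions are hypothetical.

```python
def dirichlet_moment_match(preds, eps=1e-12):
    """Estimate Dirichlet parameters from K normalized head predictions.

    `preds` is a list of K probability vectors (one per head), each of length C.
    Assumed estimator: match the mean and the total variance across classes.
    """
    k = len(preds)
    c = len(preds[0])
    # First moment per class.
    mean = [sum(p[i] for p in preds) / k for i in range(c)]
    # Population variance per class across heads.
    var = [sum((p[i] - mean[i]) ** 2 for p in preds) / k for i in range(c)]
    # sum_i Var[p_i] = sum_i m_i (1 - m_i) / (alpha0 + 1)  =>  solve for alpha0.
    total_var = max(sum(var), eps)  # eps guards against identical heads
    alpha0 = sum(m * (1.0 - m) for m in mean) / total_var - 1.0
    alphas = [m * alpha0 for m in mean]
    return alphas, alpha0


# Heads that agree give a large concentration; heads that disagree give a small one.
agree = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.91, 0.04, 0.05]]
disagree = [[0.90, 0.05, 0.05], [0.20, 0.70, 0.10], [0.30, 0.10, 0.60]]
alphas_agree, a0_agree = dirichlet_moment_match(agree)
alphas_disagree, a0_disagree = dirichlet_moment_match(disagree)
```

Under this sketch, a low α_0 signals that heads at different depths disagree, which is one plausible way to turn the fitted distribution into an uncertainty score for flagging adversarial inputs.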
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
The combination of transformers and the masked image modeling (MIM) pre-training framework has shown great potential in various vision tasks. However, the pre-training computational budget is too heavy and prevents MIM from becoming a practical training paradigm. This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) features instead of the original RGB values of the input images. In addition, we propose FastMIM-P, which progressively enlarges the input resolution during the pre-training stage to further enhance the transfer results of models with high capacity. We point out that: (i) a wide range of input resolutions in the pre-training phase can lead to similar performance in the fine-tuning phase and in downstream tasks such as detection and segmentation; (ii) the shallow layers of the encoder are more important during pre-training, and discarding the last several layers can speed up the training stage with no harm to fine-tuning performance; (iii) the decoder should match the size of the selected network; and (iv) HOG is more stable than RGB values when the resolution changes. Equipped with FastMIM, all kinds of vision backbones can be pre-trained efficiently. For example, we achieve 83.8%/84.1% top-1 accuracy on ImageNet-1K with ViT-B/Swin-B as backbones. Compared to previous relevant approaches, we achieve comparable or better top-1 accuracy while accelerating the training procedure by $\sim$5$\times$. Code can be found at https://github.com/ggjy/FastMIM.pytorch.
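As a rough illustration of the two ingredients named above (low-resolution inputs and a gradient-orientation target), the sketch below average-pools a grayscale image by a factor of 2 and computes a single magnitude-weighted orientation histogram. This is a simplified stand-in, not FastMIM's target: real HOG uses many cells with block normalization, and the function names, the 9-bin choice, and the toy ramp image are assumptions.

```python
import math


def downsample2x(img):
    """Average-pool a grayscale image (list of rows) by a factor of 2."""
    h = len(img) // 2 * 2
    w = len(img[0]) // 2 * 2
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


def orientation_histogram(img, bins=9):
    """Normalized histogram of unsigned gradient orientations (one cell, no block norm)."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = img[r][c + 1] - img[r][c - 1]  # central difference in x
            gy = img[r + 1][c] - img[r - 1][c]  # central difference in y
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            idx = min(int(angle / (180.0 / bins)), bins - 1)
            hist[idx] += mag
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


# Toy image: intensity increases left to right, so every gradient points along x.
image = [[float(c) for c in range(8)] for _ in range(8)]
low_res = downsample2x(image)
hist_full = orientation_histogram(image)
hist_low = orientation_histogram(low_res)
```

For this toy ramp the normalized histogram is the same at both resolutions, while the raw pixel values differ, which loosely mirrors the claim that an orientation-based target is less sensitive to resolution changes than RGB values.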
Semantic Change Detection (SCD) refers to the task of simultaneously extracting the changed areas and the semantic categories (before and after the changes) in Remote Sensing Images (RSIs). This is more meaningful than Binary Change Detection (BCD) since it enables detailed change analysis in the observed areas. Previous works established triple-branch Convolutional Neural Network (CNN) architectures as the paradigm for SCD. However, it remains challenging to exploit semantic information with a limited number of change samples. In this work, we investigate jointly considering the spatio-temporal dependencies to improve the accuracy of SCD. First, we propose SCanFormer (Semantic Change Transformer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIs. Then, we introduce a semantic learning scheme that leverages the spatio-temporal constraints, which are consistent with the SCD task, to guide the learning of semantic changes. The resulting network (ScanNet) significantly outperforms the baseline method in terms of both the detection of critical semantic changes and the semantic consistency of the obtained bi-temporal results. It achieves state-of-the-art accuracy on two benchmark datasets for SCD.
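The continued digitization of societal processes translates into a proliferation of time series data covering applications such as fraud detection, intrusion detection, and energy management, where anomaly detection is often essential for enabling reliability and safety. Many recent studies target anomaly detection for time series data. Indeed, time series anomaly detection is characterized by diverse data, methods, and evaluation strategies, and comparisons in existing studies consider only part of this diversity, which makes it difficult to select the best method for a particular problem setting. To address this shortcoming, we introduce taxonomies for data, methods, and evaluation strategies, use the taxonomies to provide a comprehensive overview of unsupervised time series anomaly detection, and systematically evaluate and compare state-of-the-art traditional as well as deep learning techniques. In an empirical study using nine publicly available datasets, we apply the most commonly used performance evaluation metrics to typical methods under a fair implementation standard. Based on the structure offered by the taxonomies, we report on the empirical study and provide guidelines, in the form of comparative tables, for choosing the methods most suitable for particular application settings. Finally, we propose research directions for this dynamic field.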
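Camera-captured document images usually suffer from perspective and geometric deformations. Rectifying them is of great value given their poor visual aesthetics and the degraded performance of OCR systems. Recent learning-based methods focus on accurately cropped document images. However, this may not be sufficient for overcoming practical challenges, including document images with large marginal regions or with no margins at all. Because of this impracticality, users struggle to crop documents precisely when they encounter large marginal regions. At the same time, dewarping images without margins remains an unsolved problem. To the best of our knowledge, there is still no complete and effective pipeline for rectifying document images in the wild. To address this issue, we propose a novel approach called Marior (Margin Removal and Iterative Content Rectification). Marior follows a progressive strategy to iteratively improve dewarping quality and readability in a coarse-to-fine manner. Specifically, we divide the pipeline into two modules: a margin removal module (MRM) and an iterative content rectification module (ICRM). First, we predict a segmentation mask of the input image to remove the margin, thereby obtaining a preliminary result. Then, we refine the image further by producing dense displacement flows to achieve content-aware rectification. The number of refinement iterations is determined adaptively. Experiments demonstrate the state-of-the-art performance of our method on public benchmarks. The resources are available at https://github.com/zzzhang-jx/marior for further comparison.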
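The two-stage flow described above (mask-based margin removal, then refinement repeated an adaptive number of times) can be sketched as follows. This is only a schematic under stated assumptions: the mask is taken as given rather than predicted by a network, margin removal is reduced to a bounding-box crop, and the stopping rule (stop once the mean displacement reported by a hypothetical `refine_step` falls below `tol`) is a guess, since the abstract does not specify the criterion.

```python
def crop_to_mask(image, mask):
    """Crop `image` to the bounding box of the non-zero entries of `mask`.

    Both arguments are lists of rows with identical shape.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return image  # empty mask: nothing to remove
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def refine_until_stable(image, refine_step, tol=0.5, max_iters=5):
    """Apply `refine_step` until its reported mean displacement drops below `tol`.

    `refine_step(image)` must return (refined_image, mean_displacement).
    Returns the final image and the number of iterations actually used.
    """
    for i in range(max_iters):
        image, mean_shift = refine_step(image)
        if mean_shift < tol:
            return image, i + 1
    return image, max_iters


# Toy document: a 4x5 image whose content occupies rows 1-2 and columns 1-3.
image = [[10 * r + c for c in range(5)] for r in range(4)]
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cropped = crop_to_mask(image, mask)

# Stand-in refinement step whose displacement shrinks on each call.
shifts = iter([2.0, 1.0, 0.3, 0.1])
refined, used = refine_until_stable(cropped, lambda img: (img, next(shifts)))
```

In a real system, `refine_step` would warp the image with a predicted dense displacement flow; here it returns the image unchanged so that only the control flow is demonstrated.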
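Text removal has attracted increasing attention because of its various applications in privacy protection, document restoration, and text editing. It has made significant progress with deep neural networks. However, most existing methods often produce inconsistent results for complex backgrounds. To address this issue, we propose a context-guided text removal network, termed CTRNet. CTRNet explores both low-level structure and high-level discriminative context features as prior knowledge to guide the background restoration process. We further propose a local-global content modeling (LGCM) block with CNNs and a Transformer encoder to capture local features and establish long-range relationships among pixels globally. Finally, we incorporate LGCM with context guidance for feature modeling and decoding. Experiments on the benchmark datasets SCUT-EnsText and SCUT-Syn show that CTRNet significantly outperforms existing state-of-the-art methods. Furthermore, a qualitative experiment on examination papers also demonstrates the generalization ability of our method. The code and supplementary materials are available at https://github.com/lcy0604/ctrnet.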
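The accumulation of time series data and the absence of labels make time series anomaly detection (AD) a self-supervised deep learning task. Methods based on a single normality assumption capture only certain aspects of normality and are insufficient for detecting diverse anomalies. Among them, contrastive learning methods adopted for AD always select negative pairs from normal samples, which runs counter to the purpose of the AD task. Existing methods based on multiple assumptions are usually two-stage: they first apply a pre-training process whose objective may differ from AD, so their performance is limited by the pre-trained representations. This paper proposes a deep contrastive one-class anomaly detection method (COCA), which combines the normality assumptions of contrastive learning and one-class classification. The key idea is to treat the representation and the reconstructed representation as the positive pair of negative-sample-free contrastive learning, which we name sequence contrast. We then apply a contrastive loss function composed of invariance and variance terms: the former optimizes the losses of both assumptions simultaneously, and the latter prevents hypersphere collapse. Extensive experiments on four real-world time series datasets show that the proposed method achieves state-of-the-art performance. The code is publicly available at https://github.com/ruiking04/coca.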
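A loss with an invariance term and a variance term can be sketched as below. The exact formulation used by COCA is not given in the abstract, so everything here is an assumption for illustration: the invariance term is taken as one minus the cosine similarity between each representation and its reconstruction, and the variance term is a hinge on the per-dimension standard deviation across the batch (in the style of variance regularizers used in negative-sample-free contrastive learning). All function names and the `gamma`, `eps`, and `weight` parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def invariance_term(z, z_rec):
    """Mean (1 - cosine similarity) over the positive pairs (z_i, z_rec_i)."""
    return sum(1.0 - cosine(a, b) for a, b in zip(z, z_rec)) / len(z)


def variance_term(z, gamma=1.0, eps=1e-4):
    """Hinge penalty that activates when a dimension's batch std falls below gamma."""
    n, d = len(z), len(z[0])
    total = 0.0
    for j in range(d):
        col = [row[j] for row in z]
        mu = sum(col) / n
        var = sum((x - mu) ** 2 for x in col) / n
        total += max(0.0, gamma - math.sqrt(var + eps))
    return total / d


def sequence_contrast_loss(z, z_rec, weight=1.0):
    """Invariance plus weighted variance penalty on both views (assumed combination)."""
    var_penalty = (variance_term(z) + variance_term(z_rec)) / 2.0
    return invariance_term(z, z_rec) + weight * var_penalty


# A collapsed batch (all representations identical) versus a spread-out batch.
collapsed = [[1.0, 1.0]] * 4
spread = [[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
loss_collapsed = sequence_contrast_loss(collapsed, collapsed)
loss_spread = sequence_contrast_loss(spread, spread)
```

With perfect reconstructions in both cases, the invariance term is near zero, so the difference between the two losses comes entirely from the variance term, which penalizes the collapsed batch. This is the role the abstract assigns to the variance term in preventing hypersphere collapse.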